Group fairness metrics are an established way of assessing the fairness of prediction-based decision systems. However, these metrics remain insufficiently connected to philosophical theories, and their moral import is often unclear. We propose a general framework for analyzing the fairness of decision systems based on theories of distributive justice, encompassing different established "patterns of justice" that correspond to different normative positions. We show that the most popular group fairness metrics can be interpreted as special cases of our approach. Thus, we provide a unifying and interpretative framework for group fairness metrics that reveals the normative choices associated with each of them and allows understanding their moral substance. At the same time, we provide an extension of the space of possible fairness metrics beyond those currently discussed in the fair ML literature. Our framework also allows overcoming several limitations of group fairness metrics that have been criticized in the literature, most notably that (1) they are egalitarian, i.e., they demand some form of equality between groups, which can sometimes be harmful to marginalized groups; (2) they only compare decisions across groups, but not the consequences of these decisions for the groups; and (3) they do not capture the full breadth of the distributive justice literature.
translated by Google Translate
In prediction-based decision systems, different perspectives can be at odds: the short-term business goals of the decision maker often conflict with the interests of the decision subjects. Balancing these two perspectives is a question of values. We provide a framework that makes these value-laden choices explicit. To this end, we assume that we are given a trained model and want to find a decision rule that balances the perspectives of the decision maker and the decision subjects. We provide a way to formalize both perspectives, i.e., to assess the utility of the decision maker and the fairness towards the decision subjects. In both cases, the idea is to elicit values from the decision maker and the decision subjects and then turn them into something measurable. For the fairness evaluation, we build on the literature on welfare-based fairness and ask what a fair distribution of utility (or welfare) would look like. In this step, we draw on well-established theories of distributive justice. This allows us to derive a fairness score that we can then compare against the decision maker's utility for many different decision rules. In this way, we provide an approach for balancing the utility of the decision maker against the fairness towards the decision subjects in a prediction-based decision system.
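The trade-off described above can be illustrated with a minimal, hypothetical sketch: sweep over threshold decision rules, compute a simple decision-maker utility (benefit per true positive, cost per false positive), and score fairness under an assumed egalitarian pattern of justice (group utilities should be equal). The utility values, the welfare function, and the egalitarian pattern are illustrative assumptions, not the paper's actual elicitation procedure.

```python
import numpy as np

def decision_maker_utility(y_true, y_pred, u_tp=1.0, u_fp=-1.0):
    """Expected utility per decision: benefit u_tp for each true positive,
    cost u_fp for each false positive (assumed values)."""
    return (u_tp * ((y_pred == 1) & (y_true == 1)).mean()
            + u_fp * ((y_pred == 1) & (y_true == 0)).mean())

def egalitarian_fairness(group_utilities):
    """Fairness under an (assumed) egalitarian pattern of justice:
    0 when all group utilities are equal, more negative otherwise."""
    return -(max(group_utilities) - min(group_utilities))

def sweep_thresholds(y_true, y_score, group, welfare):
    """Compare decision-maker utility and fairness score across
    threshold decision rules. `welfare(y_true, y_pred)` maps one
    group's labels and decisions to that group's utility."""
    results = []
    for t in np.linspace(0.0, 1.0, 11):
        y_pred = (y_score >= t).astype(int)
        util = decision_maker_utility(y_true, y_pred)
        group_utils = [welfare(y_true[group == g], y_pred[group == g])
                       for g in np.unique(group)]
        results.append((t, util, egalitarian_fairness(group_utils)))
    return results
```

A caller could then pick the threshold with the best utility among rules whose fairness score exceeds some acceptable floor.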
In this paper, we provide a moral analysis of two criteria of statistical fairness debated in the machine learning literature: 1) calibration between groups and 2) equality of false positive and false negative rates between groups. In our paper, we focus on moral arguments in support of either measure. The conflict between group calibration and equal false positive/false negative rates is one of the core issues in the debate about group fairness definitions among practitioners. For any thorough moral analysis, the meaning of the term "fairness" has to be made explicit and defined properly. For our paper, we equate fairness with (non-)discrimination, which is a plausible understanding in the discussion about group fairness. More specifically, we equate it with prima facie wrongful discrimination in the sense in which this is used in Lippert-Rasmussen's treatment of this definition. In this paper, we argue that a violation of group calibration may be unfair in some cases but not unfair in others. This is in line with claims already advanced in the literature that algorithmic fairness should be defined in a way that is sensitive to context. The most important practical implication is that arguments based on examples in which fairness requires between-group calibration, or equality of false positive/false negative rates, do not generalize: group calibration may be a requirement of fairness in one case but not in another.
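For concreteness, the two contested criteria can be computed per group as follows. This is a generic illustrative sketch (the function name and threshold convention are assumptions, not from the paper): calibration is checked here via the positive predictive value among those flagged positive, alongside the false positive and false negative rates.

```python
import numpy as np

def group_metrics(y_true, y_score, group, threshold=0.5):
    """Per-group calibration proxy (PPV) and error rates for binary decisions.

    y_true: ground-truth labels (0/1); y_score: predicted probabilities;
    group: group membership labels; threshold: decision cutoff.
    """
    y_pred = (y_score >= threshold).astype(int)
    results = {}
    for g in np.unique(group):
        m = group == g
        # Calibration proxy: observed positive rate among predicted positives.
        ppv = y_true[m][y_pred[m] == 1].mean() if (y_pred[m] == 1).any() else float("nan")
        # False positive rate: share of true negatives flagged positive.
        fpr = y_pred[m][y_true[m] == 0].mean() if (y_true[m] == 0).any() else float("nan")
        # False negative rate: share of true positives flagged negative.
        fnr = 1 - y_pred[m][y_true[m] == 1].mean() if (y_true[m] == 1).any() else float("nan")
        results[g] = {"ppv": ppv, "fpr": fpr, "fnr": fnr}
    return results
```

Comparing these dictionaries across groups makes the conflict tangible: when base rates differ, equalizing `ppv` across groups generally forces `fpr`/`fnr` apart, and vice versa.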
Deep learning of artificial neural networks (ANNs) is creating highly functional tools that are, unfortunately, as hard to interpret as their natural counterparts. While it is possible to identify functional modules in natural brains using technologies such as fMRI, we do not have at our disposal similarly robust methods for artificial neural networks. Ideally, understanding which parts of an artificial neural network perform what function might help us to address a number of vexing problems in ANN research, such as catastrophic forgetting and overfitting. Furthermore, revealing a network's modularity could improve our trust in them by making these black boxes more transparent. Here we introduce a new information-theoretic concept that proves useful in understanding and analyzing a network's functional modularity: the relay information $I_R$. The relay information measures how much information groups of neurons that participate in a particular function (modules) relay from inputs to outputs. Combined with a greedy search algorithm, relay information can be used to {\em identify} computational modules in neural networks. We also show that the functionality of modules correlates with the amount of relay information they carry.
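The flavor of such a greedy search can be sketched as follows. This is a toy approximation, not the paper's definition of $I_R$: here the relayed information of a candidate neuron set is simplified to the plug-in mutual information between the set's joint discrete states and the network's input-output pairs, and neurons are added greedily while that quantity keeps increasing.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * math.log2((c / n) / (px[x] / n * py[y] / n))
               for (x, y), c in pxy.items())

def greedy_module(neuron_states, io_states):
    """Greedily grow a neuron set whose joint state is most informative
    about input-output behaviour (a crude proxy for relay information).

    neuron_states: {name: list of discrete states, one per sample}
    io_states: list of (input, output) tuples, one per sample
    """
    chosen, best_mi = [], 0.0
    remaining = set(neuron_states)
    while remaining:
        scored = []
        for cand in remaining:
            # Joint state of already-chosen neurons plus the candidate.
            joint = list(zip(*[neuron_states[k] for k in chosen + [cand]]))
            scored.append((mutual_information(joint, io_states), cand))
        mi, cand = max(scored)
        if mi <= best_mi + 1e-12:  # stop when no candidate adds information
            break
        chosen.append(cand)
        best_mi = mi
        remaining.remove(cand)
    return chosen, best_mi
```

In this toy setting a neuron whose state determines the output is selected, while an uninformative neuron is left out.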
Cashews are grown by over 3 million smallholders in more than 40 countries worldwide as a principal source of income. As the third largest cashew producer in Africa, Benin has nearly 200,000 smallholder cashew growers contributing 15% of the country's national export earnings. However, a lack of information on where and how cashew trees grow across the country hinders decision-making that could support increased cashew production and poverty alleviation. By leveraging 2.4-m Planet Basemaps and 0.5-m aerial imagery, newly developed deep learning algorithms, and large-scale ground truth datasets, we successfully produced the first national map of cashew in Benin and characterized the expansion of cashew plantations between 2015 and 2021. In particular, we developed a SpatioTemporal Classification with Attention (STCA) model to map the distribution of cashew plantations, which can fully capture texture information from discriminative time steps during a growing season. We further developed a Clustering Augmented Self-supervised Temporal Classification (CASTC) model to distinguish high-density versus low-density cashew plantations by automatic feature extraction and optimized clustering. Results show that the STCA model has an overall accuracy of 80% and the CASTC model achieved an overall accuracy of 77.9%. We found that the cashew area in Benin has doubled from 2015 to 2021 with 60% of new plantation development coming from cropland or fallow land, while encroachment of cashew plantations into protected areas has increased by 70%. Only half of cashew plantations were high-density in 2021, suggesting high potential for intensification. Our study illustrates the power of combining high-resolution remote sensing imagery and state-of-the-art deep learning algorithms to better understand tree crops in the heterogeneous smallholder landscape.
Local patterns play an important role in statistical physics as well as in image processing. Two-dimensional ordinal patterns were studied by Ribeiro et al. who determined permutation entropy and complexity in order to classify paintings and images of liquid crystals. Here we find that the 2 by 2 patterns of neighboring pixels come in three types. The statistics of these types, expressed by two parameters, contains the relevant information to describe and distinguish textures. The parameters are most stable and informative for isotropic structures.
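Extracting 2-by-2 ordinal patterns can be sketched as below. This is a generic illustration (function name and tie-breaking convention are assumptions): each 2x2 window of pixels is reduced to the permutation that sorts its four values, and pattern frequencies are counted; permutation entropy then follows from the relative frequencies.

```python
import numpy as np

def ordinal_patterns_2x2(img):
    """Count rank-order (ordinal) patterns over all 2x2 pixel windows.

    Each window's four values are replaced by the permutation that sorts
    them; ties are broken by position via a stable sort (one common choice).
    """
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    counts = {}
    for i in range(h - 1):
        for j in range(w - 1):
            window = img[i:i + 2, j:j + 2].ravel()
            pattern = tuple(np.argsort(window, kind="stable"))
            counts[pattern] = counts.get(pattern, 0) + 1
    return counts
```

From `counts`, the normalized frequencies give the pattern distribution whose entropy and statistical complexity can be used to characterize a texture.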
It is well known that conservative mechanical systems exhibit local oscillatory behaviours due to their elastic and gravitational potentials, which, together with the inertial properties of the system, completely characterise these periodic motions. The classification of these periodic behaviours and their geometric characterisation have been the subject of a long-standing debate, which recently led to the so-called eigenmanifold theory. The eigenmanifold characterises nonlinear oscillations as a generalisation of linear eigenspaces. Motivated by the efficient execution of periodic tasks, we use tools from this theory to construct an optimization problem aimed at inducing desired closed-loop oscillations through a state feedback law. We solve the constructed optimization problem via gradient-descent methods involving neural networks. Extensive simulations show the validity of the approach.
Artificial intelligence (AI) systems based on deep neural networks (DNNs) and machine learning (ML) algorithms are increasingly used to solve critical problems in bioinformatics, biomedical informatics, and precision medicine. However, complex DNN or ML models, which are unavoidably opaque and perceived as black-box methods, may not be able to explain why and how they make certain decisions. Such black-box models are difficult to comprehend not only for targeted users and decision-makers but also for AI developers. Moreover, in sensitive areas like healthcare, explainability and accountability are not only desirable properties of AI but also legal requirements -- especially when AI may have significant impacts on human lives. Explainable artificial intelligence (XAI) is an emerging field that aims to mitigate the opaqueness of black-box models and make it possible to interpret how AI systems make their decisions with transparency. An interpretable ML model can explain how it makes predictions and which factors affect the model's outcomes. The majority of state-of-the-art interpretable ML methods have been developed in a domain-agnostic way and originate from computer vision, automated reasoning, or even statistics. Many of these methods cannot be directly applied to bioinformatics problems without prior customization, extension, and domain adaptation. In this paper, we discuss the importance of explainability with a focus on bioinformatics. We analyse and provide a comprehensive overview of model-specific and model-agnostic interpretable ML methods and tools. Via several case studies covering bioimaging, cancer genomics, and biomedical text mining, we show how bioinformatics research could benefit from XAI methods and how they could help improve decision fairness.
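One of the simplest model-agnostic interpretability methods mentioned in this line of work is permutation feature importance: shuffle one feature's column and measure how much a performance metric drops. The sketch below is a generic, from-scratch illustration (not a specific tool surveyed in the paper); `model_predict` and `metric` are assumed caller-supplied callables, with higher metric values meaning better performance.

```python
import numpy as np

def permutation_importance(model_predict, X, y, metric, n_repeats=5, seed=0):
    """Model-agnostic importance: performance drop when one feature's
    column is randomly shuffled, averaged over repeats."""
    rng = np.random.default_rng(seed)
    baseline = metric(y, model_predict(X))
    importances = []
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break feature-target link
            drops.append(baseline - metric(y, model_predict(Xp)))
        importances.append(float(np.mean(drops)))
    return importances
```

A feature the model ignores gets an importance of exactly zero, since shuffling it leaves the predictions unchanged.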
Generic Object Tracking (GOT) is the problem of tracking target objects, specified by bounding boxes in the first frame of a video. While the task has received much attention in the last decades, researchers have almost exclusively focused on the single-object setting. Multi-object GOT benefits from a wider applicability, rendering it more attractive in real-world applications. We attribute the lack of research interest in this problem to the absence of suitable benchmarks. In this work, we introduce a new large-scale GOT benchmark, LaGOT, containing multiple annotated target objects per sequence. Our benchmark allows researchers to tackle key remaining challenges in GOT, aiming to increase robustness and reduce computation through the joint tracking of multiple objects simultaneously. Furthermore, we propose a Transformer-based GOT tracker, TaMOs, capable of joint processing of multiple objects through shared computation. TaMOs achieves a 4x faster run-time in the case of 10 concurrent objects compared to tracking each object independently, and outperforms existing single-object trackers on our new benchmark. Finally, TaMOs achieves highly competitive results on single-object GOT datasets, setting a new state-of-the-art on TrackingNet with a success rate AUC of 84.4%. Our benchmark, code, and trained models will be made publicly available.
In recent years the applications of machine learning models have increased rapidly, due to the large amount of available data and technological progress. While some domains like web analysis can benefit from this with only minor restrictions, other fields like medicine, with patient data, are more strongly regulated. In particular, \emph{data privacy} plays an important role, as recently highlighted by the trustworthy AI initiative of the EU and general privacy regulations in legislation. Another major challenge is that the required training \emph{data} is often \emph{distributed} in terms of features or samples and unavailable for classical batch learning approaches. In 2016 Google came up with a framework, called \emph{Federated Learning}, to solve both of these problems. We provide a brief overview of existing methods and applications in the fields of vertical and horizontal \emph{Federated Learning}, as well as \emph{Federated Transfer Learning}.
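The core of Google's framework, Federated Averaging, is simple: each client trains on its private data locally, and a server averages the resulting model weights, weighted by client dataset size, so raw data never leaves a client. The following is a minimal single-process sketch for a linear least-squares model; the function names and the use of a few full-batch gradient steps as the "local training" are illustrative assumptions.

```python
import numpy as np

def local_train(w, X, y, lr=0.1, epochs=5):
    """A client's local update: a few full-batch gradient steps
    on its private least-squares objective."""
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def federated_averaging(clients, rounds=30, dim=2):
    """FedAvg: broadcast the global weights, let each client train
    locally, then average the returned weights by dataset size.
    Only model parameters are exchanged, never raw data."""
    w = np.zeros(dim)
    total = sum(len(y) for _, y in clients)
    for _ in range(rounds):
        updates = [local_train(w.copy(), X, y) for X, y in clients]
        w = sum(len(y) / total * u for u, (_, y) in zip(updates, clients))
    return w
```

This corresponds to the horizontal setting (clients share features but hold different samples); vertical and transfer variants partition the data differently but keep the same train-locally/aggregate-centrally loop.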